The emergence of viruses such as severe acute respiratory syndrome coronavirus and Nipah virus has
underscored the role of animal reservoirs in human disease and the need for reservoir surveillance. Here, we 
used a panviral DNA microarray to investigate the death of a captive beluga whale in an aquatic park. A highly 
divergent coronavirus, tentatively named coronavirus SW1, was identified in liver tissue from the deceased 
whale. Subsequently, the entire genome of SW1 was sequenced, yielding a genome of 31,686 nucleotides. 
Phylogenetic analysis revealed SW1 to be a novel virus distantly related to but most similar to group III 
coronaviruses. 


An estimated 75% of emerging diseases arise from zoonotic 
sources (30). Zoological parks and aquariums provide a unique 
opportunity for emerging virus surveillance. For example, in 
1999, the first harbinger of West Nile virus emergence in North 
America was the mysterious death of birds at the Bronx Zoo/Wildlife
Conservation Park (25). Thus, zoo populations may
serve as sentinels for emerging viruses. 

Panviral DNA microarrays represent one approach for massively
parallel viral surveillance. We have previously described
a panviral DNA microarray (ViroChip) capable of detecting
thousands of known viruses as well as novel viruses related to
known viral families in a single assay (35). ViroChip has previously
been used to identify severe acute respiratory syndrome
(SARS) coronavirus (19, 35); xenotropic murine leukemia virus-related
virus, a novel human retrovirus, in patients with
familial prostate cancer (32); and a novel clade of human
rhinoviruses (16).

In this paper, a ViroChip was used to interrogate primary
liver tissue from a recently deceased beluga whale for the
presence of viruses. Microarray hybridization strongly suggested
that a coronavirus was present in the liver tissue. Subsequent
complete genome sequencing and phylogenetic analysis
revealed that the virus was a novel, highly divergent coronavirus
most similar overall to group III coronaviruses. We have
tentatively named this virus coronavirus SW1.

Clinical history and necropsy results. A 13-year-old, male,
captive-born beluga whale died after a short medical illness
characterized by generalized pulmonary disease and terminal
acute liver failure. The liver demonstrated diffusely increased
friability with multifocal, red-yellow mottling and irregularly
shaped areas of obvious necrosis (Fig. 1A). Histological examination
of liver tissue demonstrated a severe, multifocal, and
coalescing centrilobular-to-massive acute hepatic necrosis
(data not shown). To study the liver in more detail, conventional
transmission electron microscopy was performed as previously
described (12). Abundant nondescript round viral particles
measuring ~60 to 80 nm with cores of approximately 45
to 50 nm were identified in the cytoplasm, but this was insufficient
to identify the virus (Fig. 1B). We note that while the
observed particles were smaller than those typically associated
with coronaviruses, coronavirus particles as small as 50 nm
have been reported (26).

Virus isolation attempts. Liver tissue homogenate was inoculated
into bovine turbinate, Vero, MARC 145, primary fetal
porcine kidney, rabbit kidney (RK-13Ky), BHK, bovine embryonic
testicle, MDCK, bovine pulmonary arterial endothelium,
and human rectal tumor 18 cells and embryonating
chicken eggs. No evidence of viral growth was observed.

Panviral DNA microarray analysis. RNA was extracted
from liver tissue samples of the infected whale and two uninfected
control whales. Two hundred nanograms of RNA was randomly
amplified and hybridized to the panviral microarray as
previously described (35). Multiple oligonucleotides derived
from various coronaviruses gave strong hybridization intensity
with the infected liver sample, suggesting the presence of a
coronavirus.

Consensus coronavirus PCR and complete genome sequencing.
To confirm the microarray findings, reverse transcription-PCR
(RT-PCR) was performed with published consensus
coronavirus primers (9). This yielded a 454-bp PCR product that
possessed 70% amino acid identity with the 1ab replicase polyprotein
of avian infectious bronchitis virus, as determined by
tBLASTx (2). The entire viral genome was subsequently
sequenced using shotgun sequencing, RT-PCR, and
5' and 3' rapid amplification of cDNA ends. The initial assembly
was confirmed by sequencing a series of overlapping RT-PCR
products, yielding the finished genome of 31,686 nucleotides (nt).

FIG. 1. (A) Explanted liver of the dead beluga whale. (B) Electron
microscopy of whale liver at ×129,300 magnification. Bar, 100 nm.

Analysis of viral ORFs. Putative open reading frames
(ORFs) were predicted using NCBI’s ORF Finder (37), and the
results were refined using information about the noncoding sequence
of SW1 to determine the most likely start sites. SW1
contained 14 putative ORFs (Table 1), including ORFs with
similarity to the five major ORFs conserved in all known coronaviruses
(Table 1 and Fig. 2). SW1 encoded eight putative
accessory proteins whose genes were located between the M
and N genes. None of these proteins had any detectable sequence
similarity to proteins in other known coronaviruses,
and their functional roles are currently unknown. A number of
the ORFs had noteworthy features. The ORF 6 protein possessed
amino acid similarity (BLAST 1e-06) to human astrovirus
capsid proteins. Astrovirus capsid proteins have recently
been demonstrated to disrupt tight junctions and thereby increase
the barrier permeability of polarized cell monolayers,
resulting in increased viral dissemination (27). The ORF 10
protein had significant amino acid similarity to a number of
uridine kinases (BLAST 2e-26); no virus described to date
encodes a uridine kinase (11). In addition, since some viruses
encode secreted proteins that interfere with the host immune
response (1, 33), the accessory proteins were analyzed for the
presence of signal sequences by using SignalP (4). The ORF 7
and 8 proteins contained putative signal sequences, suggesting
that they may be secreted. Finally, analysis with PolyPhobius
(15) suggested that among the accessory genes, ORFs 5b and
9 contained transmembrane domains.
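
As an illustration of the ORF prediction step, the following Python
sketch scans the three forward reading frames for ATG-to-stop ORFs. It
is a simplified stand-in and not the algorithm of NCBI's ORF Finder.
The function name find_orfs, the min_aa threshold, and the toy sequence
are assumptions made for this example, and the refinement of start
sites from noncoding sequence is not attempted.

```python
# Minimal forward-strand ORF scan (illustrative; not NCBI ORF Finder).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_aa=50):
    """Return sorted (start, end, frame) tuples for ATG-to-stop ORFs.

    Coordinates are 0-based and end-exclusive; the span includes the
    stop codon. For each stop codon only the most upstream in-frame
    ATG is reported. min_aa counts residues without the stop codon.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if n_aa >= min_aa:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


# Toy sequence (hypothetical): ATG GCT GCT GCT TAA in frame 2.
toy = "CCATGGCTGCTGCTTAAG"
print(find_orfs(toy, min_aa=3))  # [(2, 17, 2)]
```

A real genome would also call for a scan of the reverse complement and
for handling of the ribosomal frameshift that joins ORF 1a to ORF 1ab;
both are left out here to keep the sketch short.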

Analysis of noncoding viral RNA. Known coronavirus 5'
untranslated regions (UTRs) range from 209 to 528 nt (6),
including a leader sequence of 65 to 98 nt. In SW1, the 5' UTR
was 523 nt, with a leader sequence of 79 nt. Typically, the final
7 to 18 nt of the leader form the transcription-regulating sequence
(TRS) motif, which defines the 5' end of each subgenomic
RNA. A putative TRS motif was identified using MEME (3).
This was experimentally confirmed by amplifying
the 5' ends of the mRNAs for the N and S genes, using a
primer in the putative leader sequence and a second primer
within each gene. Comparison of this amplified sequence to
the genomic sequence confirmed that the TRS motif was 5' AAACA.
Ten of the ORFs were immediately preceded by a
TRS consensus sequence (Table 1 and Fig. 3). While ORFs 3,
5b, and 5c were not preceded by a TRS sequence, internal
translation from subgenomic RNAs has been described for
coronaviruses (18, 20). The SW1 3' UTR of 369 nt fell within
the known size range for other coronaviruses (288 to 506 nt)
(6, 29).
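
The TRS-to-ATG distances listed in Table 1 can be thought of as the
result of a scan like the Python sketch below. It locates each
occurrence of a TRS motif and reports the gap to the next downstream
ATG. The function name trs_to_atg_distances, the max_gap cutoff, and
the toy sequences are assumptions for this example, and the distance
is assumed to count the nucleotides between the last base of the TRS
and the A of the ATG (so 0 means the ATG directly follows the motif).
How the distances in Table 1 were actually measured is not stated.

```python
def trs_to_atg_distances(genome, trs="AAACA", max_gap=500):
    """Return (trs_start, atg_start, distance) for each TRS occurrence.

    Positions are 0-based. distance is the number of nucleotides
    between the end of the TRS and the first downstream ATG; TRS
    occurrences with no ATG within max_gap nucleotides are skipped.
    """
    genome = genome.upper().replace("U", "T")
    hits = []
    pos = genome.find(trs)
    while pos != -1:
        trs_end = pos + len(trs)
        # Search window ends so that an ATG starting at
        # trs_end + max_gap is still found.
        atg = genome.find("ATG", trs_end, trs_end + max_gap + 3)
        if atg != -1:
            hits.append((pos, atg, atg - trs_end))
        pos = genome.find(trs, pos + 1)
    return hits


# Toy sequences (hypothetical), not taken from the SW1 genome.
print(trs_to_atg_distances("GGAAACAATGCCC"))  # [(2, 7, 0)]
print(trs_to_atg_distances("AAACAGGGATG"))    # [(0, 8, 3)]
```

A short motif such as AAACA will also occur by chance inside coding
sequence, so hits from a scan like this would still need to be checked
against the predicted ORF start sites.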

Phylogenetic analysis reveals that SW1 is a novel, highly
divergent coronavirus. Coronaviruses are classified based on
genomic organization and phylogenetic analysis of full-length
genomes (13). Phylogenetic analysis of the five major ORFs by
use of ClustalX V1.83 (31) (Fig. 2A to E) demonstrated that
overall SW1 was most closely related to group III coronaviruses.
Its genomic organization was also most
similar to that of known group III coronaviruses (Fig. 3).
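
Two ingredients of a distance-based, bootstrapped phylogenetic
analysis can be sketched in Python as follows: a pairwise distance
computed from aligned protein sequences, and the resampling of
alignment columns that produces one bootstrap replicate. This is not
the ClustalX implementation and does not build a tree; the simple
p-distance, the function names, and the toy alignment are assumptions
chosen for illustration, and the input is assumed to be already
aligned.

```python
import random


def p_distance(a, b):
    """Fraction of mismatches over columns where neither row has a gap."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def distance_matrix(alignment):
    """Map every ordered pair of sequence names to its p-distance."""
    names = sorted(alignment)
    return {(m, n): p_distance(alignment[m], alignment[n])
            for m in names for n in names}


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one replicate)."""
    length = len(next(iter(alignment.values())))
    cols = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[c] for c in cols)
            for name, seq in alignment.items()}


# Toy alignment (hypothetical sequences, equal length, "-" marks a gap).
toy = {"A": "MKT-LV", "B": "MKTALV", "C": "MRT-LI"}
dist = distance_matrix(toy)
replicate = bootstrap_alignment(toy, random.Random(0))
```

In a full analysis, each replicate's distance matrix would be passed
to a tree-building method, and the fraction of replicates that recover
a given grouping would be reported as its bootstrap support.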

The emergence of SARS in 2003 marked a renaissance in
the field of coronavirology. Since then, new members of the
family Coronaviridae have been identified in birds (14, 24),
humans (34, 38), bats (21, 22, 28, 39), and wild mammals from
Chinese live-animal markets (8). In this study, we identified a
novel coronavirus in the liver tissue of a deceased beluga
whale. While there is a report that antisera to group I
coronaviruses immunohistochemically stained small intestine
tissue from harbor seals (Phoca vitulina) with acute necrotizing
enteritis (5), this is the first description of the complete
genome sequence of a coronavirus found in a marine mammal.

TABLE 1. Predicted ORFs

ORF      Predicted size    Presence of     Distance (nt) from   % Identity with avian
         (aa) of protein   TRS sequence    TRS to ATG           infectious bronchitis virus^a
1a       3,955             Yes             447                  25
1ab      6,664                                                  42
2 (S)    1,473             Yes             30                   21
3 (E)    96                                                     27
4 (M)    261               Yes             62                   32
5a       139               Yes             0                    NA
5b       173                                                    NA
5c       176                                                    NA
6        229               Yes             1                    NA
7        162               Yes             0                    NA
8        60                Yes             7                    NA
9        153               Yes             0                    NA
10       211               Yes             0                    NA
11 (N)   380               Yes             105                  35

^a NA, not applicable.

FIG. 2. Phylogenetic analysis of SW1. Phylogenetic trees were constructed from protein sequences by using the neighbor-joining method with 1,000
bootstrap replicates. Abbreviations: PEDV, porcine epidemic diarrhea virus; NL63, human coronavirus NL63; FCoV, feline coronavirus; TGEV,
transmissible gastroenteritis virus; 229E, human coronavirus 229E; BCoV, bovine coronavirus; MHV, murine hepatitis virus strain JHM; SARS, SARS
coronavirus; BtCoV, bat coronavirus (BtCoV/133/2005); HEV, porcine hemagglutinating encephalomyelitis virus; HKU1, human coronavirus HKU1;
OC43, human coronavirus OC43; IBV, infectious bronchitis virus; TCoV, turkey coronavirus; CFBCoV F250, Chinese ferret badger coronavirus
Guangxi/F250/2006; CFBCoV F247, Chinese ferret badger coronavirus Guangxi/F247/2006; ALCCoV, Asian leopard cat coronavirus Guangxi/F230/2006.
The accession numbers of the sequences used are found in Table S1 in the supplemental material. (A) ORF 1ab; (B) spike; (C) envelope;
(D) membrane; (E) nucleocapsid.

FIG. 3. Genome organization of SW1. Diagrammatic representation of the 5'-to-3' arrangement of ORFs of the genome of SW1. The TRSs
are indicated by black circles.

The detection of a novel coronavirus in a deceased beluga
whale raises a number of questions, including whether beluga
whales are the natural host for this virus and whether the virus
was pathogenic to the whale. There is precedent for animal
coronaviruses causing hepatic pathology (23, 36). In addition,
SARS coronavirus and human coronavirus HKU1 may be associated
with liver disease and hepatitis (7, 10). Thus, the liver damage
seen during the beluga whale necropsy (Fig. 1A) may have been
caused by SW1 infection, although this remains to be experimentally
verified. Furthermore, it is not yet clear whether beluga whales are the
natural host, an amplifying host, or a dead-end host for SW1.

In conclusion, we have used a ViroChip to identify a novel
coronavirus directly from primary animal tissues. Furthermore,
the identification of a previously unrecognized virus in a captive
animal underscores the vast diversity of viruses that remains
unexplored in animals. These viruses have the potential
to be transmitted to humans or other animals, with significant
implications for human and animal health. Continued systematic
surveillance of animal populations in zoos and aquaria is
key for public health preparedness for future outbreaks.

Accession numbers. Primary microarray data have been deposited
in NCBI GEO under accession number GSE9238. The
nucleotide sequence for the SW1 genome was deposited in
GenBank under accession number EU111742.

collection, and Fred Murphy for advice with electron micrographs.